{ "cells": [ { "cell_type": "markdown", "metadata": { "toc": "true" }, "source": [ "# Table of Contents\n", "
" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "# Requirements and helper functions" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "## Requirements\n", "\n", "This notebook requires to have numpy and matplotlib installed.\n", "I'm also exploring usage of numba and cython later, so they are also needed." ] }, { "cell_type": "code", "execution_count": 1, "metadata": {}, "outputs": [ { "name": "stdout", "output_type": "stream", "text": [ "Requirement already satisfied: watermark in /usr/local/lib/python3.6/dist-packages (1.5.0)\n", "Requirement already satisfied: numpy in /usr/local/lib/python3.6/dist-packages (1.14.5)\n", "Requirement already satisfied: scipy in /usr/local/lib/python3.6/dist-packages (1.1.0)\n", "Requirement already satisfied: matplotlib in /usr/local/lib/python3.6/dist-packages (3.0.2)\n", "Requirement already satisfied: numba in /usr/local/lib/python3.6/dist-packages (0.37.0)\n", "Requirement already satisfied: cython in /usr/local/lib/python3.6/dist-packages (0.27.2)\n", "Requirement already satisfied: tqdm in /usr/local/lib/python3.6/dist-packages (4.19.6)\n", "Requirement already satisfied: ipython in /usr/local/lib/python3.6/dist-packages (from watermark) (7.0.1)\n", "Requirement already satisfied: python-dateutil>=2.1 in /usr/local/lib/python3.6/dist-packages (from matplotlib) (2.7.3)\n", "Requirement already satisfied: pyparsing!=2.0.4,!=2.1.2,!=2.1.6,>=2.0.1 in /usr/local/lib/python3.6/dist-packages (from matplotlib) (2.3.0)\n", "Requirement already satisfied: cycler>=0.10 in /usr/local/lib/python3.6/dist-packages (from matplotlib) (0.10.0)\n", "Requirement already satisfied: kiwisolver>=1.0.1 in /usr/local/lib/python3.6/dist-packages (from matplotlib) (1.0.1)\n", "Requirement already satisfied: llvmlite>=0.22.0.dev0 in /usr/local/lib/python3.6/dist-packages (from numba) (0.22.0)\n", "Requirement already satisfied: simplegeneric>0.8 in /usr/local/lib/python3.6/dist-packages (from ipython->watermark) (0.8.1)\n", "Requirement already satisfied: pygments in /usr/local/lib/python3.6/dist-packages (from ipython->watermark) (2.2.0)\n", "Requirement already satisfied: pexpect; sys_platform != \"win32\" in /usr/local/lib/python3.6/dist-packages (from ipython->watermark) (4.6.0)\n", "Requirement already satisfied: jedi>=0.10 in /usr/local/lib/python3.6/dist-packages (from ipython->watermark) (0.12.1)\n", "Requirement already satisfied: backcall in /usr/local/lib/python3.6/dist-packages (from ipython->watermark) (0.1.0)\n", "Requirement already satisfied: pickleshare in /usr/local/lib/python3.6/dist-packages (from ipython->watermark) (0.7.5)\n", "Requirement already satisfied: traitlets>=4.2 in /usr/local/lib/python3.6/dist-packages (from ipython->watermark) (4.3.2)\n", "Requirement already satisfied: decorator in /usr/local/lib/python3.6/dist-packages (from ipython->watermark) (4.3.0)\n", "Requirement already satisfied: prompt-toolkit<2.1.0,>=2.0.0 in /usr/local/lib/python3.6/dist-packages (from ipython->watermark) (2.0.4)\n", "Requirement already satisfied: setuptools>=18.5 in /usr/local/lib/python3.6/dist-packages (from ipython->watermark) (40.5.0)\n", "Requirement already satisfied: six>=1.5 in /home/lilian/.local/lib/python3.6/site-packages (from python-dateutil>=2.1->matplotlib) (1.11.0)\n", "Requirement already satisfied: ptyprocess>=0.5 in /usr/local/lib/python3.6/dist-packages (from pexpect; sys_platform != \"win32\"->ipython->watermark) (0.6.0)\n", "Requirement already satisfied: parso>=0.3.0 in /usr/local/lib/python3.6/dist-packages 
(from jedi>=0.10->ipython->watermark) (0.3.1)\n", "Requirement already satisfied: ipython-genutils in /usr/local/lib/python3.6/dist-packages (from traitlets>=4.2->ipython->watermark) (0.2.0)\n", "Requirement already satisfied: wcwidth in /usr/local/lib/python3.6/dist-packages (from prompt-toolkit<2.1.0,>=2.0.0->ipython->watermark) (0.1.7)\n", "Lilian Besson \n", "\n", "CPython 3.6.7\n", "IPython 7.0.1\n", "\n", "numpy 1.14.5\n", "scipy 1.1.0\n", "matplotlib 3.0.2\n", "numba 0.37.0\n", "cython 0.27.2\n", "tqdm 4.19.6\n", "\n", "compiler : GCC 8.2.0\n", "system : Linux\n", "release : 4.15.0-38-generic\n", "machine : x86_64\n", "processor : x86_64\n", "CPU cores : 4\n", "interpreter: 64bit\n" ] } ], "source": [ "!pip install watermark numpy scipy matplotlib numba cython tqdm\n", "%load_ext watermark\n", "%watermark -v -m -p numpy,scipy,matplotlib,numba,cython,tqdm -a \"Lilian Besson\"" ] }, { "cell_type": "code", "execution_count": 2, "metadata": {}, "outputs": [], "source": [ "import numpy as np\n", "import matplotlib.pyplot as plt\n", "import numba" ] }, { "cell_type": "code", "execution_count": 3, "metadata": { "code_folding": [ 0 ] }, "outputs": [], "source": [ "def in_notebook():\n", " \"\"\"Check if the code is running inside a Jupyter notebook or not. Cf. http://stackoverflow.com/a/39662359/.\n", "\n", " >>> in_notebook()\n", " False\n", " \"\"\"\n", " try:\n", " shell = get_ipython().__class__.__name__\n", " if shell == 'ZMQInteractiveShell': # Jupyter notebook or qtconsole?\n", " return True\n", " elif shell == 'TerminalInteractiveShell': # Terminal running IPython?\n", " return False\n", " else:\n", " return False # Other type (?)\n", " except NameError:\n", " return False # Probably standard Python interpreter" ] }, { "cell_type": "code", "execution_count": 4, "metadata": {}, "outputs": [ { "name": "stdout", "output_type": "stream", "text": [ "Info: Using the Jupyter notebook version of the tqdm() decorator, tqdm_notebook() ...\n" ] } ], "source": [ "if in_notebook():\n", " from tqdm import tqdm_notebook as tqdm\n", " print(\"Info: Using the Jupyter notebook version of the tqdm() decorator, tqdm_notebook() ...\") # DEBUG\n", "else:\n", " from tqdm import tqdm" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "## Mathematical notations for stationary problems\n", "\n", "We consider $K \\geq 1$ arms, which are distributions $\\nu_k$.\n", "We focus on Bernoulli distributions, which are characterized by their means, $\\nu_k = \\mathcal{B}(\\mu_k)$ for $\\mu_k\\in[0,1]$.\n", "A stationary bandit problem is defined here by the vector $[\\mu_1,\\dots,\\mu_K]$.\n", "\n", "For a fixed problem and a *horizon* $T\\in\\mathbb{N}$, $T\\geq1$, we draw samples from the $K$ distributions to get *data*: $\\forall t, r_k(t) \\sim \\nu_k$, ie, $\\mathbb{P}(r_k(t) = 1) = \\mu_k$ and $r_k(t) \\in \\{0,1\\}$." ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "## Generating fake stationary data\n", "\n", "Here we give some examples of stationary problems and examples of data we can draw from them." 
] }, { "cell_type": "code", "execution_count": 5, "metadata": {}, "outputs": [], "source": [ "def bernoulli_samples(means, horizon=1000):\n", " if np.size(means) == 1:\n", " return np.random.binomial(1, means, size=horizon)\n", " else:\n", " results = np.zeros((np.size(means), horizon))\n", " for i, mean in enumerate(means):\n", " results[i] = np.random.binomial(1, mean, size=horizon)\n", " return results" ] }, { "cell_type": "code", "execution_count": 6, "metadata": {}, "outputs": [ { "data": { "text/plain": [ "array([0, 1, 0, 0, 1, 1, 1, 0, 0, 0, 0, 0, 1, 0, 1, 1, 0, 0, 0, 0])" ] }, "execution_count": 6, "metadata": {}, "output_type": "execute_result" } ], "source": [ "problem1 = [0.5]\n", "\n", "bernoulli_samples(problem1, horizon=20)" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "For bandit problem with $K \\geq 2$ arms, the *goal* is to design an online learning algorithm that roughly do the following:\n", "\n", "- For time $t=1$ to $t=T$ (unknown horizon)\n", " 1. Algorithm $A$ decide to draw arm $A(t) \\in\\{1,\\dots,K\\}$,\n", " 2. Get the reward $r(t) = r_{A(t)}(t) \\sim \\nu_{A(t)}$ from the (Bernoulli) distribution of that arm,\n", " 3. Give this observation of reward $r(t)$ coming from arm $A(t)$ to the algorithm,\n", " 4. Update internal state of the algorithm\n", "\n", "An algorithm is efficient if it obtains a high (expected) sum reward, ie, $\\sum_{t=1}^T r(t)$." ] }, { "cell_type": "code", "execution_count": 7, "metadata": {}, "outputs": [ { "data": { "text/plain": [ "array([[0., 0., 0., 0., 0., 0., 0., 0., 0., 0., 0., 0., 0., 0., 0., 0.,\n", " 0., 0., 0., 0.],\n", " [1., 1., 1., 1., 0., 1., 1., 1., 0., 0., 0., 0., 0., 0., 1., 1.,\n", " 0., 1., 1., 0.],\n", " [1., 1., 1., 1., 1., 1., 1., 1., 1., 1., 1., 1., 1., 1., 1., 1.,\n", " 1., 1., 0., 1.]])" ] }, "execution_count": 7, "metadata": {}, "output_type": "execute_result" } ], "source": [ "problem2 = [0.1, 0.5, 0.9]\n", "\n", "bernoulli_samples(problem2, horizon=20)" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "For instance on these data, the best arm is clearly the third one, with expected reward of $\\mu^* = \\max_k \\mu_k = 0.9$." ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "## Mathematical notations for piecewise stationary problems\n", "\n", "Now we fix the horizon $T\\in\\mathbb{N}$, $T\\geq1$ and we also consider a set of $\\Upsilon_T$ *break points*, $\\tau_1,\\dots,\\tau_{\\Upsilon_T} \\in\\{1,\\dots,T\\}$. We denote $\\tau_0 = 0$ and $\\tau_{\\Upsilon_T+1} = T$ for convenience of notations.\n", "We can assume that breakpoints are far \"enough\" from each other, for instance that there exists an integer $N\\in\\mathbb{N},N\\geq1$ such that $\\min_{i=0}^{\\Upsilon_T} \\tau_{i+1} - \\tau_i \\geq N K$. 
{ "cell_type": "markdown", "metadata": {}, "source": [ "## Mathematical notations for piecewise stationary problems\n", "\n", "Now we fix the horizon $T\in\mathbb{N}$, $T\geq1$, and we also consider a set of $\Upsilon_T$ *break points*, $\tau_1,\dots,\tau_{\Upsilon_T} \in\{1,\dots,T\}$. We denote $\tau_0 = 0$ and $\tau_{\Upsilon_T+1} = T$ for convenience of notations.\n", "We can assume that the breakpoints are far \"enough\" from each other, for instance that there exists an integer $N\in\mathbb{N},N\geq1$ such that $\min_{i=0}^{\Upsilon_T} \tau_{i+1} - \tau_i \geq N K$. That is, on each *stationary interval*, a uniform sampling of the $K$ arms gives at least $N$ samples per arm.\n", "\n", "Now, in any stationary interval $[\tau_i + 1, \tau_{i+1}]$, the $K \geq 1$ arms are distributions $\nu_k^{(i)}$.\n", "We focus on Bernoulli distributions, which are characterized by their means, $\nu_k^{(i)} := \mathcal{B}(\mu_k^{(i)})$ for $\mu_k^{(i)}\in[0,1]$.\n", "A piecewise stationary bandit problem is defined here by the vector $[\mu_k^{(i)}]_{1\leq k \leq K, 1 \leq i \leq \Upsilon_T}$.\n", "\n", "For a fixed problem and a *horizon* $T\in\mathbb{N}$, $T\geq1$, we draw samples from the $K$ distributions to get *data*: $\forall t, r_k(t) \sim \nu_k^{(i)}$, for $i$ the unique index of the stationary interval such that $t\in[\tau_i + 1, \tau_{i+1}]$." ] },
{ "cell_type": "markdown", "metadata": {}, "source": [ "## Generating fake piecewise stationary data\n", "\n", "The format used to define a piecewise stationary problem is the following. It is compact but generic!\n", "\n", "The first example considers a single arm, with 2 uniformly spaced breakpoints.\n", "- On the first interval, for instance from $t=1$ to $t=500$, that is $\tau_1 = 500$: $\mu_1^{(1)} = 0.1$,\n", "- On the second interval, for instance from $t=501$ to $t=1000$, that is $\tau_2 = 1000$: $\mu_1^{(2)} = 0.5$,\n", "- On the third interval, for instance from $t=1001$ to $t=1500$: $\mu_1^{(3)} = 0.8$." ] },
{ "cell_type": "code", "execution_count": 8, "metadata": {}, "outputs": [], "source": [ "# With 1 arm only!\n", "problem_piecewise_0 = lambda horizon: {\n", "    \"listOfMeans\": [\n", "        [0.1],  # 0 to 499\n", "        [0.5],  # 500 to 999\n", "        [0.8],  # 1000 to 1499\n", "    ],\n", "    \"changePoints\": [\n", "        int(0 * horizon / 1500.0),\n", "        int(500 * horizon / 1500.0),\n", "        int(1000 * horizon / 1500.0),\n", "    ],\n", "}" ] },
{ "cell_type": "code", "execution_count": 9, "metadata": {}, "outputs": [], "source": [ "# With 2 arms\n", "problem_piecewise_1 = lambda horizon: {\n", "    \"listOfMeans\": [\n", "        [0.1, 0.2],  # 0 to 399\n", "        [0.1, 0.3],  # 400 to 799\n", "        [0.5, 0.3],  # 800 to 1199\n", "        [0.4, 0.3],  # 1200 to 1599\n", "        [0.3, 0.9],  # 1600 to end\n", "    ],\n", "    \"changePoints\": [\n", "        int(0 * horizon / 2000.0),\n", "        int(400 * horizon / 2000.0),\n", "        int(800 * horizon / 2000.0),\n", "        int(1200 * horizon / 2000.0),\n", "        int(1600 * horizon / 2000.0),\n", "    ],\n", "}" ] },
{ "cell_type": "code", "execution_count": 10, "metadata": { "code_folding": [ 1 ] }, "outputs": [], "source": [ "# With 3 arms\n", "problem_piecewise_2 = lambda horizon: {\n", "    \"listOfMeans\": [\n", "        [0.2, 0.5, 0.9],  # 0 to 399\n", "        [0.2, 0.2, 0.9],  # 400 to 799\n", "        [0.2, 0.2, 0.1],  # 800 to 1199\n", "        [0.7, 0.2, 0.1],  # 1200 to 1599\n", "        [0.7, 0.5, 0.1],  # 1600 to end\n", "    ],\n", "    \"changePoints\": [\n", "        int(0 * horizon / 2000.0),\n", "        int(400 * horizon / 2000.0),\n", "        int(800 * horizon / 2000.0),\n", "        int(1200 * horizon / 2000.0),\n", "        int(1600 * horizon / 2000.0),\n", "    ],\n", "}" ] },
{ "cell_type": "code", "execution_count": 11, "metadata": { "code_folding": [ 1 ] }, "outputs": [], "source": [ "# With 3 arms\n", "problem_piecewise_3 = lambda horizon: {\n", "    \"listOfMeans\": [\n", "        [0.4, 0.5, 0.9],  # 0 to 399\n", "        [0.5, 0.4, 0.7],  # 400 to 799\n", "        [0.6, 0.3, 0.5],  # 800 to 1199\n", "        [0.7, 0.2, 0.3],  # 1200 to 1599\n", "        [0.8, 0.1, 0.1],  # 1600 to end\n", "    ],\n", "    \"changePoints\": [\n", "        int(0 * horizon / 2000.0),\n", "        int(400 * horizon / 2000.0),\n", "        int(800 * horizon / 2000.0),\n",
" int(1200 * horizon / 2000.0),\n", " int(1600 * horizon / 2000.0),\n", " ],\n", "}" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "Now we can write a utility function that transform this compact representation into a full list of means." ] }, { "cell_type": "code", "execution_count": 12, "metadata": { "code_folding": [] }, "outputs": [], "source": [ "def getFullHistoryOfMeans(problem, horizon=2000):\n", " \"\"\"Return the vector of mean of the arms, for a piece-wise stationary MAB.\n", "\n", " - It is a numpy array of shape (nbArms, horizon).\n", " \"\"\"\n", " pb = problem(horizon)\n", " listOfMeans, changePoints = pb['listOfMeans'], pb['changePoints']\n", " nbArms = len(listOfMeans[0])\n", " if horizon is None:\n", " horizon = np.max(changePoints)\n", " meansOfArms = np.ones((nbArms, horizon))\n", " for armId in range(nbArms):\n", " nbChangePoint = 0\n", " for t in range(horizon):\n", " if nbChangePoint < len(changePoints) - 1 and t >= changePoints[nbChangePoint + 1]:\n", " nbChangePoint += 1\n", " meansOfArms[armId][t] = listOfMeans[nbChangePoint][armId]\n", " return meansOfArms" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "For examples :" ] }, { "cell_type": "code", "execution_count": 13, "metadata": {}, "outputs": [ { "data": { "text/plain": [ "array([[0.1, 0.1, 0.1, 0.1, 0.1, 0.1, 0.1, 0.1, 0.1, 0.1, 0.1, 0.1, 0.1,\n", " 0.1, 0.1, 0.1, 0.5, 0.5, 0.5, 0.5, 0.5, 0.5, 0.5, 0.5, 0.5, 0.5,\n", " 0.5, 0.5, 0.5, 0.5, 0.5, 0.5, 0.5, 0.8, 0.8, 0.8, 0.8, 0.8, 0.8,\n", " 0.8, 0.8, 0.8, 0.8, 0.8, 0.8, 0.8, 0.8, 0.8, 0.8, 0.8]])" ] }, "execution_count": 13, "metadata": {}, "output_type": "execute_result" } ], "source": [ "getFullHistoryOfMeans(problem_piecewise_0, horizon=50)" ] }, { "cell_type": "code", "execution_count": 14, "metadata": {}, "outputs": [ { "data": { "text/plain": [ "array([[0.1, 0.1, 0.1, 0.1, 0.1, 0.1, 0.1, 0.1, 0.1, 0.1, 0.1, 0.1, 0.1,\n", " 0.1, 0.1, 0.1, 0.1, 0.1, 0.1, 0.1, 0.5, 0.5, 0.5, 0.5, 0.5, 0.5,\n", " 0.5, 0.5, 0.5, 0.5, 0.4, 0.4, 0.4, 0.4, 0.4, 0.4, 0.4, 0.4, 0.4,\n", " 0.4, 0.3, 0.3, 0.3, 0.3, 0.3, 0.3, 0.3, 0.3, 0.3, 0.3],\n", " [0.2, 0.2, 0.2, 0.2, 0.2, 0.2, 0.2, 0.2, 0.2, 0.2, 0.3, 0.3, 0.3,\n", " 0.3, 0.3, 0.3, 0.3, 0.3, 0.3, 0.3, 0.3, 0.3, 0.3, 0.3, 0.3, 0.3,\n", " 0.3, 0.3, 0.3, 0.3, 0.3, 0.3, 0.3, 0.3, 0.3, 0.3, 0.3, 0.3, 0.3,\n", " 0.3, 0.9, 0.9, 0.9, 0.9, 0.9, 0.9, 0.9, 0.9, 0.9, 0.9]])" ] }, "execution_count": 14, "metadata": {}, "output_type": "execute_result" } ], "source": [ "getFullHistoryOfMeans(problem_piecewise_1, horizon=50)" ] }, { "cell_type": "code", "execution_count": 15, "metadata": {}, "outputs": [ { "data": { "text/plain": [ "array([[0.2, 0.2, 0.2, 0.2, 0.2, 0.2, 0.2, 0.2, 0.2, 0.2, 0.2, 0.2, 0.2,\n", " 0.2, 0.2, 0.2, 0.2, 0.2, 0.2, 0.2, 0.2, 0.2, 0.2, 0.2, 0.2, 0.2,\n", " 0.2, 0.2, 0.2, 0.2, 0.7, 0.7, 0.7, 0.7, 0.7, 0.7, 0.7, 0.7, 0.7,\n", " 0.7, 0.7, 0.7, 0.7, 0.7, 0.7, 0.7, 0.7, 0.7, 0.7, 0.7],\n", " [0.5, 0.5, 0.5, 0.5, 0.5, 0.5, 0.5, 0.5, 0.5, 0.5, 0.2, 0.2, 0.2,\n", " 0.2, 0.2, 0.2, 0.2, 0.2, 0.2, 0.2, 0.2, 0.2, 0.2, 0.2, 0.2, 0.2,\n", " 0.2, 0.2, 0.2, 0.2, 0.2, 0.2, 0.2, 0.2, 0.2, 0.2, 0.2, 0.2, 0.2,\n", " 0.2, 0.5, 0.5, 0.5, 0.5, 0.5, 0.5, 0.5, 0.5, 0.5, 0.5],\n", " [0.9, 0.9, 0.9, 0.9, 0.9, 0.9, 0.9, 0.9, 0.9, 0.9, 0.9, 0.9, 0.9,\n", " 0.9, 0.9, 0.9, 0.9, 0.9, 0.9, 0.9, 0.1, 0.1, 0.1, 0.1, 0.1, 0.1,\n", " 0.1, 0.1, 0.1, 0.1, 0.1, 0.1, 0.1, 0.1, 0.1, 0.1, 0.1, 0.1, 0.1,\n", " 0.1, 0.1, 0.1, 0.1, 0.1, 0.1, 0.1, 0.1, 0.1, 0.1, 0.1]])" ] }, "execution_count": 15, "metadata": {}, "output_type": 
"execute_result" } ], "source": [ "getFullHistoryOfMeans(problem_piecewise_2, horizon=50)" ] }, { "cell_type": "code", "execution_count": 16, "metadata": {}, "outputs": [ { "data": { "text/plain": [ "array([[0.4, 0.4, 0.4, 0.4, 0.4, 0.4, 0.4, 0.4, 0.4, 0.4, 0.5, 0.5, 0.5,\n", " 0.5, 0.5, 0.5, 0.5, 0.5, 0.5, 0.5, 0.6, 0.6, 0.6, 0.6, 0.6, 0.6,\n", " 0.6, 0.6, 0.6, 0.6, 0.7, 0.7, 0.7, 0.7, 0.7, 0.7, 0.7, 0.7, 0.7,\n", " 0.7, 0.8, 0.8, 0.8, 0.8, 0.8, 0.8, 0.8, 0.8, 0.8, 0.8],\n", " [0.5, 0.5, 0.5, 0.5, 0.5, 0.5, 0.5, 0.5, 0.5, 0.5, 0.4, 0.4, 0.4,\n", " 0.4, 0.4, 0.4, 0.4, 0.4, 0.4, 0.4, 0.3, 0.3, 0.3, 0.3, 0.3, 0.3,\n", " 0.3, 0.3, 0.3, 0.3, 0.2, 0.2, 0.2, 0.2, 0.2, 0.2, 0.2, 0.2, 0.2,\n", " 0.2, 0.1, 0.1, 0.1, 0.1, 0.1, 0.1, 0.1, 0.1, 0.1, 0.1],\n", " [0.9, 0.9, 0.9, 0.9, 0.9, 0.9, 0.9, 0.9, 0.9, 0.9, 0.7, 0.7, 0.7,\n", " 0.7, 0.7, 0.7, 0.7, 0.7, 0.7, 0.7, 0.5, 0.5, 0.5, 0.5, 0.5, 0.5,\n", " 0.5, 0.5, 0.5, 0.5, 0.3, 0.3, 0.3, 0.3, 0.3, 0.3, 0.3, 0.3, 0.3,\n", " 0.3, 0.1, 0.1, 0.1, 0.1, 0.1, 0.1, 0.1, 0.1, 0.1, 0.1]])" ] }, "execution_count": 16, "metadata": {}, "output_type": "execute_result" } ], "source": [ "getFullHistoryOfMeans(problem_piecewise_3, horizon=50)" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "And now we need to be able to generate samples from such distributions." ] }, { "cell_type": "code", "execution_count": 17, "metadata": {}, "outputs": [], "source": [ "def piecewise_bernoulli_samples(problem, horizon=1000):\n", " fullMeans = getFullHistoryOfMeans(problem, horizon=horizon)\n", " nbArms, horizon = np.shape(fullMeans)\n", " results = np.zeros((nbArms, horizon))\n", " for i in range(nbArms):\n", " mean_i = fullMeans[i, :]\n", " for t in range(horizon):\n", " mean_i_t = mean_i[t]\n", " results[i, t] = np.random.binomial(1, mean_i_t)\n", " return results" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "Examples:" ] }, { "cell_type": "code", "execution_count": 18, "metadata": {}, "outputs": [ { "data": { "text/plain": [ "array([[0.1, 0.1, 0.1, 0.1, 0.1, 0.1, 0.1, 0.1, 0.1, 0.1, 0.1, 0.1, 0.1,\n", " 0.1, 0.1, 0.1, 0.1, 0.1, 0.1, 0.1, 0.1, 0.1, 0.1, 0.1, 0.1, 0.1,\n", " 0.1, 0.1, 0.1, 0.1, 0.1, 0.1, 0.1, 0.5, 0.5, 0.5, 0.5, 0.5, 0.5,\n", " 0.5, 0.5, 0.5, 0.5, 0.5, 0.5, 0.5, 0.5, 0.5, 0.5, 0.5, 0.5, 0.5,\n", " 0.5, 0.5, 0.5, 0.5, 0.5, 0.5, 0.5, 0.5, 0.5, 0.5, 0.5, 0.5, 0.5,\n", " 0.5, 0.8, 0.8, 0.8, 0.8, 0.8, 0.8, 0.8, 0.8, 0.8, 0.8, 0.8, 0.8,\n", " 0.8, 0.8, 0.8, 0.8, 0.8, 0.8, 0.8, 0.8, 0.8, 0.8, 0.8, 0.8, 0.8,\n", " 0.8, 0.8, 0.8, 0.8, 0.8, 0.8, 0.8, 0.8, 0.8]])" ] }, "execution_count": 18, "metadata": {}, "output_type": "execute_result" }, { "data": { "text/plain": [ "array([[0., 0., 0., 0., 1., 0., 0., 0., 0., 0., 0., 0., 0., 0., 0., 0.,\n", " 0., 0., 0., 0., 0., 0., 0., 0., 0., 0., 0., 0., 0., 0., 0., 0.,\n", " 0., 1., 1., 1., 1., 0., 0., 1., 1., 0., 1., 0., 0., 0., 1., 0.,\n", " 0., 0., 1., 1., 1., 1., 0., 1., 1., 0., 1., 0., 1., 1., 1., 1.,\n", " 1., 0., 0., 1., 1., 1., 1., 1., 1., 1., 1., 1., 1., 1., 0., 1.,\n", " 1., 1., 1., 1., 1., 1., 1., 1., 1., 1., 1., 1., 1., 0., 1., 1.,\n", " 1., 1., 0., 1.]])" ] }, "execution_count": 18, "metadata": {}, "output_type": "execute_result" } ], "source": [ "getFullHistoryOfMeans(problem_piecewise_0, horizon=100)\n", "piecewise_bernoulli_samples(problem_piecewise_0, horizon=100)" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "We easily spot the (approximate) location of the breakpoint!\n", "\n", "Another example:" ] }, { "cell_type": "code", "execution_count": 19, "metadata": {}, "outputs": [ { 
"data": { "text/plain": [ "array([[0., 0., 1., 0., 0., 0., 0., 0., 0., 0., 0., 0., 0., 0., 0., 0.,\n", " 0., 0., 0., 1., 0., 0., 0., 0., 0., 0., 0., 0., 0., 0., 0., 0.,\n", " 0., 0., 0., 0., 0., 0., 0., 0., 0., 0., 0., 0., 0., 0., 1., 1.,\n", " 0., 1., 1., 0., 0., 0., 1., 0., 0., 0., 0., 1., 0., 0., 0., 1.,\n", " 1., 1., 0., 1., 0., 0., 1., 0., 1., 1., 0., 0., 0., 1., 1., 1.,\n", " 1., 0., 0., 0., 1., 0., 0., 0., 0., 0., 0., 1., 0., 0., 1., 0.,\n", " 1., 0., 1., 1.],\n", " [1., 0., 0., 0., 0., 0., 0., 0., 0., 0., 0., 0., 1., 0., 1., 0.,\n", " 0., 0., 0., 0., 0., 0., 0., 0., 1., 0., 0., 0., 1., 0., 0., 1.,\n", " 1., 0., 0., 0., 1., 0., 0., 0., 1., 1., 1., 1., 0., 0., 1., 0.,\n", " 0., 0., 1., 0., 0., 0., 0., 1., 0., 0., 0., 0., 1., 1., 0., 1.,\n", " 1., 1., 0., 1., 1., 1., 0., 0., 0., 0., 1., 1., 0., 1., 0., 1.,\n", " 1., 1., 1., 1., 1., 1., 1., 1., 1., 1., 1., 1., 1., 1., 1., 1.,\n", " 0., 1., 0., 1.]])" ] }, "execution_count": 19, "metadata": {}, "output_type": "execute_result" } ], "source": [ "piecewise_bernoulli_samples(problem_piecewise_1, horizon=100)" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "----\n", "# Python implementations of some statistical tests\n", "\n", "I will implement here the following statistical tests (and I give a link to the implementation of the correspond bandit policy in my framework [`SMPyBandits`](https://smpybandits.github.io/)\n", "\n", "- Monitored (based on a McDiarmid inequality), for Monitored-UCB or [`M-UCB`](),\n", "- CUSUM, for [`CUSUM-UCB`](https://smpybandits.github.io/docs/Policies.CD_UCB.html?highlight=cusum#Policies.CD_UCB.CUSUM_IndexPolicy),\n", "- PHT, for [`PHT-UCB`](https://smpybandits.github.io/docs/Policies.CD_UCB.html?highlight=cusum#Policies.CD_UCB.PHT_IndexPolicy),\n", "- Gaussian GLR, for [`GaussianGLR-UCB`](https://smpybandits.github.io/docs/Policies.CD_UCB.html?highlight=glr#Policies.CD_UCB.GaussianGLR_IndexPolicy),\n", "- Bernoulli GLR, for [`BernoulliGLR-UCB`](https://smpybandits.github.io/docs/Policies.CD_UCB.html?highlight=glr#Policies.CD_UCB.BernoulliGLR_IndexPolicy)." ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "## A stupid detection test (pure random!)\n", "Just to be sure that the test functions work as wanted, I start by writing a stupid change detection test, which is purely random!" ] }, { "cell_type": "code", "execution_count": 20, "metadata": {}, "outputs": [], "source": [ "def PurelyRandom(all_data, t, proba=0.5):\n", " return np.random.random() < proba" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "## Monitored" ] }, { "cell_type": "code", "execution_count": 21, "metadata": {}, "outputs": [], "source": [ "NB_ARMS = 1\n", "WINDOW_SIZE = 80" ] }, { "cell_type": "code", "execution_count": 22, "metadata": {}, "outputs": [], "source": [ "def Monitored(all_data, t,\n", " window_size=WINDOW_SIZE, threshold_b=None,\n", " ):\n", " r\"\"\" A change is detected for the current arm if the following test is true:\n", "\n", " .. 
math:: |\\sum_{i=w/2+1}^{w} Y_i - \\sum_{i=1}^{w/2} Y_i | > b ?\n", "\n", " - where :math:`Y_i` is the i-th data in the latest w data from this arm (ie, :math:`X_k(t)` for :math:`t = n_k - w + 1` to :math:`t = n_k` current number of samples from arm k).\n", " - where :attr:`threshold_b` is the threshold b of the test, and :attr:`window_size` is the window-size w.\n", " \"\"\"\n", " data = all_data[:t]\n", " # don't try to detect change if there is not enough data!\n", " if len(data) < window_size:\n", " return False\n", " \n", " # compute parameters\n", " horizon = len(all_data)\n", " if threshold_b is None:\n", " threshold_b = np.sqrt(window_size/2 * np.log(2 * NB_ARMS * horizon**2))\n", "\n", " last_w_data = data[-window_size:]\n", " sum_first_half = np.sum(last_w_data[:window_size//2])\n", " sum_second_half = np.sum(last_w_data[window_size//2:])\n", " return abs(sum_first_half - sum_second_half) > threshold_b" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "## CUSUM" ] }, { "cell_type": "code", "execution_count": 23, "metadata": {}, "outputs": [], "source": [ "#: Precision of the test.\n", "EPSILON = 0.5\n", "\n", "#: Default value of :math:`\\lambda`.\n", "LAMBDA = 1\n", "\n", "#: Hypothesis on the speed of changes: between two change points, there is at least :math:`M * K` time steps, where K is the number of arms, and M is this constant.\n", "MIN_NUMBER_OF_OBSERVATION_BETWEEN_CHANGE_POINT = 100\n", "\n", "MAX_NB_RANDOM_EVENTS = 1" ] }, { "cell_type": "code", "execution_count": 24, "metadata": {}, "outputs": [], "source": [ "from scipy.special import comb" ] }, { "cell_type": "code", "execution_count": 25, "metadata": { "code_folding": [] }, "outputs": [], "source": [ "def compute_h_alpha__CUSUM(horizon, \n", " verbose=False,\n", " M=MIN_NUMBER_OF_OBSERVATION_BETWEEN_CHANGE_POINT,\n", " max_nb_random_events=MAX_NB_RANDOM_EVENTS,\n", " nbArms=1,\n", " epsilon=EPSILON,\n", " lmbda=LAMBDA,\n", " ):\n", " r\"\"\" Compute the values :math:`C_1^+, C_1^-, C_1, C_2, h` from the formulas in Theorem 2 and Corollary 2 in the paper.\"\"\"\n", " T = int(max(1, horizon))\n", " UpsilonT = int(max(1, max_nb_random_events))\n", " K = int(max(1, nbArms))\n", " # print(\"compute_h_alpha__CUSUM() with:\\nT = {}, UpsilonT = {}, K = {}, epsilon = {}, lmbda = {}, M = {}\".format(T, UpsilonT, K, epsilon, lmbda, M)) # DEBUG\n", " C2 = np.log(3) + 2 * np.exp(- 2 * epsilon**2 * M) / lmbda\n", " C1_minus = np.log(((4 * epsilon) / (1-epsilon)**2) * comb(M, int(np.floor(2 * epsilon * M))) * (2 * epsilon)**M + 1)\n", " C1_plus = np.log(((4 * epsilon) / (1+epsilon)**2) * comb(M, int(np.ceil(2 * epsilon * M))) * (2 * epsilon)**M + 1)\n", " C1 = min(C1_minus, C1_plus)\n", " if C1 == 0: C1 = 1 # FIXME\n", " h = 1/C1 * np.log(T / UpsilonT)\n", " alpha = K * np.sqrt((C2 * UpsilonT)/(C1 * T) * np.log(T / UpsilonT))\n", " alpha *= 0.01 # FIXME Just divide alpha to not have too large\n", " alpha = max(0, min(1, alpha)) # crop to [0, 1]\n", " # print(\"Gave C2 = {}, C1- = {} and C1+ = {} so C1 = {}, and h = {} and alpha = {}\".format(C2, C1_minus, C1_plus, C1, h, alpha)) # DEBUG\n", " return h, alpha" ] }, { "cell_type": "code", "execution_count": 26, "metadata": {}, "outputs": [], "source": [ "def CUSUM(all_data, t,\n", " epsilon=EPSILON,\n", " M=MIN_NUMBER_OF_OBSERVATION_BETWEEN_CHANGE_POINT,\n", " threshold_h=None,\n", " ):\n", " r\"\"\" Detect a change in the current arm, using the two-sided CUSUM algorithm [Page, 1954].\n", "\n", " - For each *data* k, compute:\n", "\n", " .. 
math::\n", "\n", " s_k^- &= (y_k - \\hat{u}_0 - \\varepsilon) 1(k > M),\\\\\n", " s_k^+ &= (\\hat{u}_0 - y_k - \\varepsilon) 1(k > M),\\\\\n", " g_k^+ &= max(0, g_{k-1}^+ + s_k^+),\\\\\n", " g_k^- &= max(0, g_{k-1}^- + s_k^-),\\\\\n", "\n", " - The change is detected if :math:`\\max(g_k^+, g_k^-) > h`, where :attr:`threshold_h` is the threshold of the test,\n", " - And :math:`\\hat{u}_0 = \\frac{1}{M} \\sum_{k=1}^{M} y_k` is the mean of the first M samples, where M is :attr:`M` the min number of observation between change points.\n", " \"\"\"\n", " data = all_data[:t]\n", " \n", " # compute parameters\n", " horizon = len(all_data)\n", " if threshold_h is None:\n", " threshold_h, _ = compute_h_alpha__CUSUM(horizon, M, 1, epsilon=epsilon)\n", "\n", " gp, gm = 0, 0\n", " # First we use the first M samples to calculate the average :math:`\\hat{u_0}`.\n", " u0hat = np.mean(data[:M])\n", " for k, y_k in enumerate(data):\n", " if k <= M:\n", " continue\n", " sp = u0hat - y_k - epsilon # no need to multiply by (k > self.M)\n", " sm = y_k - u0hat - epsilon # no need to multiply by (k > self.M)\n", " gp, gm = max(0, gp + sp), max(0, gm + sm)\n", " if max(gp, gm) >= threshold_h:\n", " return True\n", " return False" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "## PHT" ] }, { "cell_type": "code", "execution_count": 27, "metadata": {}, "outputs": [], "source": [ "def PHT(all_data, t,\n", " epsilon=EPSILON,\n", " M=MIN_NUMBER_OF_OBSERVATION_BETWEEN_CHANGE_POINT,\n", " threshold_h=None,\n", " ):\n", " r\"\"\" Detect a change in the current arm, using the two-sided PHT algorithm [Hinkley, 1971].\n", "\n", " - For each *data* k, compute:\n", "\n", " .. math::\n", "\n", " s_k^- &= y_k - \\hat{y}_k - \\varepsilon,\\\\\n", " s_k^+ &= \\hat{y}_k - y_k - \\varepsilon,\\\\\n", " g_k^+ &= max(0, g_{k-1}^+ + s_k^+),\\\\\n", " g_k^- &= max(0, g_{k-1}^- + s_k^-),\\\\\n", "\n", " - The change is detected if :math:`\\max(g_k^+, g_k^-) > h`, where :attr:`threshold_h` is the threshold of the test,\n", " - And :math:`\\hat{y}_k = \\frac{1}{k} \\sum_{s=1}^{k} y_s` is the mean of the first k samples.\n", " \"\"\"\n", " data = all_data[:t]\n", " \n", " # compute parameters\n", " horizon = len(all_data)\n", " if threshold_h is None:\n", " threshold_h, _ = compute_h_alpha__CUSUM(horizon, M, 1, epsilon=epsilon)\n", "\n", " gp, gm = 0, 0\n", " # First we use the first M samples to calculate the average :math:`\\hat{u_0}`.\n", " for k, y_k in enumerate(data):\n", " y_k_hat = np.mean(data[:k])\n", " sp = y_k_hat - y_k - epsilon\n", " sm = y_k - y_k_hat - epsilon\n", " gp, gm = max(0, gp + sp), max(0, gm + sm)\n", " if max(gp, gm) >= threshold_h:\n", " return True\n", " return False" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "## Gaussian GLR" ] }, { "cell_type": "code", "execution_count": 28, "metadata": {}, "outputs": [], "source": [ "def compute_c_alpha__GLR(t0, t, horizon, verbose=False, exponentBeta=1.05, alpha_t1=0.1):\n", " r\"\"\" Compute the values :math:`c, \\alpha` from the corollary of of Theorem 2 from [\"Sequential change-point detection: Laplace concentration of scan statistics and non-asymptotic delay bounds\", O.-A. Maillard, 2018].\n", "\n", " .. note:: I am currently exploring the following variant (November 2018):\n", "\n", " - The probability of uniform exploration, :math:`\\alpha`, is computed as a function of the current time:\n", "\n", " .. 
math:: \\forall t>0, \\alpha = \\alpha_t := \\alpha_{t=1} \\frac{1}{\\max(1, t^{\\beta})}.\n", "\n", " - with :math:`\\beta > 1, \\beta` = ``exponentBeta`` (=1.05) and :math:`\\alpha_{t=1} < 1, \\alpha_{t=1}` = ``alpha_t1`` (=0.01).\n", " \"\"\"\n", " T = int(max(1, horizon))\n", " delta = 1.0 / T\n", " if verbose: print(\"compute_c_alpha__GLR() with t = {}, t0 = {}, T = {}, delta = 1/T = {}\".format(t, t0, T, delta)) # DEBUG\n", " t_m_t0 = abs(t - t0)\n", " c = (1 + (1 / (t_m_t0 + 1.0))) * 2 * np.log((2 * t_m_t0 * np.sqrt(t_m_t0 + 2)) / delta)\n", " if c < 0 and np.isinf(c): c = float('+inf')\n", " assert exponentBeta > 1.0, \"Error: compute_c_alpha__GLR should have a exponentBeta > 1 but it was given = {}...\".format(exponentBeta) # DEBUG\n", " alpha = alpha_t1 / max(1, t)**exponentBeta\n", " if verbose: print(\"Gave c = {} and alpha = {}\".format(c, alpha)) # DEBUG\n", " return c, alpha" ] }, { "cell_type": "code", "execution_count": 29, "metadata": {}, "outputs": [], "source": [ "def klGauss(x, y, sig2x=0.25):\n", " r\"\"\" Kullback-Leibler divergence for Gaussian distributions of means ``x`` and ``y`` and variances ``sig2x`` and ``sig2y``, :math:`\\nu_1 = \\mathcal{N}(x, \\sigma_x^2)` and :math:`\\nu_2 = \\mathcal{N}(y, \\sigma_x^2)`:\n", "\n", " .. math:: \\mathrm{KL}(\\nu_1, \\nu_2) = \\frac{(x - y)^2}{2 \\sigma_y^2} + \\frac{1}{2}\\left( \\frac{\\sigma_x^2}{\\sigma_y^2} - 1 \\log\\left(\\frac{\\sigma_x^2}{\\sigma_y^2}\\right) \\right).\n", "\n", " See https://en.wikipedia.org/wiki/Normal_distribution#Other_properties\n", "\n", " - sig2y = sig2x (same variance).\n", " \"\"\"\n", " return (x - y) ** 2 / (2. * sig2x)" ] }, { "cell_type": "code", "execution_count": 30, "metadata": {}, "outputs": [], "source": [ "def GaussianGLR(all_data, t,\n", " threshold_h=None,\n", " ):\n", " r\"\"\" Detect a change in the current arm, using the Generalized Likelihood Ratio test (GLR) and the :attr:`kl` function.\n", "\n", " - For each *time step* :math:`s` between :math:`t_0=0` and :math:`t`, compute:\n", "\n", " .. math::\n", "\n", " G^{\\mathcal{N}_1}_{t_0:s:t} = (s-t_0+1)(t-s) \\mathrm{kl}(\\mu_{s+1,t}, \\mu_{t_0,s}) / (t-t_0+1).\n", "\n", " - The change is detected if there is a time :math:`s` such that :math:`G^{\\mathcal{N}_1}_{t_0:s:t} > h`, where :attr:`threshold_h` is the threshold of the test,\n", " - And :math:`\\mu_{a,b} = \\frac{1}{b-a+1} \\sum_{s=a}^{b} y_s` is the mean of the samples between :math:`a` and :math:`b`.\n", " \"\"\"\n", " data = all_data[:t]\n", " t0 = 0\n", " horizon = len(all_data)\n", " \n", " # compute parameters\n", " if threshold_h is None:\n", " threshold_h, _ = compute_c_alpha__GLR(0, t, horizon)\n", "\n", " mu = lambda a, b: np.mean(data[a : b+1])\n", " for s in range(t0, t - 1):\n", " this_kl = klGauss(mu(s+1, t), mu(t0, s))\n", " glr = ((s - t0 + 1) * (t - s) / (t - t0 + 1)) * this_kl\n", " if glr >= threshold_h:\n", " return True\n", " return False" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "## Bernoulli GLR" ] }, { "cell_type": "code", "execution_count": 31, "metadata": {}, "outputs": [], "source": [ "eps = 1e-6 #: Threshold value: everything in [0, 1] is truncated to [eps, 1 - eps]\n", "\n", "def klBern(x, y):\n", " r\"\"\" Kullback-Leibler divergence for Bernoulli distributions. https://en.wikipedia.org/wiki/Bernoulli_distribution#Kullback.E2.80.93Leibler_divergence\n", "\n", " .. 
math:: \\mathrm{KL}(\\mathcal{B}(x), \\mathcal{B}(y)) = x \\log(\\frac{x}{y}) + (1-x) \\log(\\frac{1-x}{1-y}).\"\"\"\n", " x = min(max(x, eps), 1 - eps)\n", " y = min(max(y, eps), 1 - eps)\n", " return x * np.log(x / y) + (1 - x) * np.log((1 - x) / (1 - y))" ] }, { "cell_type": "code", "execution_count": 32, "metadata": {}, "outputs": [], "source": [ "def BernoulliGLR(all_data, t,\n", " threshold_h=None,\n", " ):\n", " r\"\"\" Detect a change in the current arm, using the Generalized Likelihood Ratio test (GLR) and the :attr:`kl` function.\n", "\n", " - For each *time step* :math:`s` between :math:`t_0=0` and :math:`t`, compute:\n", "\n", " .. math::\n", "\n", " G^{\\mathcal{N}_1}_{t_0:s:t} = (s-t_0+1)(t-s) \\mathrm{kl}(\\mu_{s+1,t}, \\mu_{t_0,s}) / (t-t_0+1).\n", "\n", " - The change is detected if there is a time :math:`s` such that :math:`G^{\\mathcal{N}_1}_{t_0:s:t} > h`, where :attr:`threshold_h` is the threshold of the test,\n", " - And :math:`\\mu_{a,b} = \\frac{1}{b-a+1} \\sum_{s=a}^{b} y_s` is the mean of the samples between :math:`a` and :math:`b`.\n", " \"\"\"\n", " data = all_data[:t]\n", " t0 = 0\n", " horizon = len(all_data)\n", " \n", " # compute parameters\n", " if threshold_h is None:\n", " threshold_h, _ = compute_c_alpha__GLR(0, t, horizon)\n", "\n", " mu = lambda a, b: np.mean(data[a : b+1])\n", " for s in range(t0, t - 1):\n", " this_kl = klBern(mu(s+1, t), mu(t0, s))\n", " glr = ((s - t0 + 1) * (t - s) / (t - t0 + 1)) * this_kl\n", " if glr >= threshold_h:\n", " return True\n", " return False" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "## List of all Python algorithms" ] }, { "cell_type": "code", "execution_count": 33, "metadata": {}, "outputs": [], "source": [ "all_CD_algorithms = [\n", " PurelyRandom,\n", " Monitored, CUSUM, PHT, GaussianGLR, BernoulliGLR\n", "]" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "----\n", "# Numba implementations of some statistical tests\n", "\n", "I should try to use the [`numba.jit`](https://numba.pydata.org/numba-doc/latest/reference/jit-compilation.html#numba.jit) decorator for all the functions defined above." ] }, { "cell_type": "code", "execution_count": 34, "metadata": {}, "outputs": [], "source": [ "import numba" ] }, { "cell_type": "code", "execution_count": 35, "metadata": {}, "outputs": [], "source": [ "@numba.jit(nopython=True)\n", "def klBern_numba(x, y):\n", " x = min(max(x, eps), 1 - eps)\n", " y = min(max(y, eps), 1 - eps)\n", " return x * np.log(x / y) + (1 - x) * np.log((1 - x) / (1 - y))" ] }, { "cell_type": "code", "execution_count": 36, "metadata": {}, "outputs": [], "source": [ "@numba.jit(nopython=True)\n", "def klGauss_numba(x, y, sig2x=0.25):\n", " return (x - y) ** 2 / (2. * sig2x)" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "